LLM Agents for Generating Microservice-based Applications: how complex is your specification?

Yellin, Daniel M.

arXiv.org Artificial Intelligence

In this paper we evaluate the capabilities of LLM Agents in generating code for real-world problems. Specifically, we explore code synthesis for microservice-based applications, a widely used architectural pattern for building applications. We define a standard template for specifying these applications, and we propose a metric for scoring the difficulty of a specification. The higher the score, the more difficult it is to generate code for the specification. Our experimental results show that agents using strong LLMs (like o3-mini) do fairly well on medium-difficulty specifications but do poorly on those of higher difficulty levels. This is due to more intricate business logic, a greater use of external services, database integration, and inclusion of non-functional capabilities such as authentication. We analyzed the errors in LLM-synthesized code and report on the key challenges LLM Agents face in generating code for these specifications. Finally, we show that using a fine-grained approach to code generation improves the correctness of the generated code.
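The abstract mentions a difficulty metric driven by factors such as business-logic intricacy, external services, database integration, and non-functional capabilities. A minimal sketch of such a metric, assuming a weighted-sum form with invented weights and field names (the paper's actual metric may differ), could look like this:

```python
# Hypothetical difficulty score for a microservice specification. The
# weights and spec fields are illustrative stand-ins, not the paper's metric.

def difficulty_score(spec: dict) -> float:
    """Higher score = harder to synthesize correct code for this spec."""
    weights = {
        "business_logic_rules": 2.0,    # intricate business logic
        "external_services": 3.0,       # calls to external services
        "database_tables": 2.5,         # database integration
        "nonfunctional_features": 4.0,  # e.g., authentication
    }
    return sum(weights[k] * spec.get(k, 0) for k in weights)

easy = {"business_logic_rules": 2, "external_services": 0,
        "database_tables": 0, "nonfunctional_features": 0}
hard = {"business_logic_rules": 8, "external_services": 3,
        "database_tables": 4, "nonfunctional_features": 2}

assert difficulty_score(easy) < difficulty_score(hard)
```

A scalar score like this makes it straightforward to bucket specifications into difficulty levels and compare agent success rates per bucket.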


Taxonomy of User Needs and Actions

Shelby, Renee, Diaz, Fernando, Prabhakaran, Vinodkumar

arXiv.org Artificial Intelligence

The growing ubiquity of conversational AI highlights the need for frameworks that capture not only users' instrumental goals but also the situated, adaptive, and social practices through which they achieve them. Existing taxonomies of conversational behavior either overgeneralize, remain domain-specific, or reduce interactions to narrow dialogue functions. To address this gap, we introduce the Taxonomy of User Needs and Actions (TUNA), an empirically grounded framework developed through iterative qualitative analysis of 1193 human-AI conversations, supplemented by theoretical review and validation across diverse contexts. TUNA organizes user actions into a three-level hierarchy encompassing behaviors associated with information seeking, synthesis, procedural guidance, content creation, social interaction, and meta-conversation. By centering user agency and appropriation practices, TUNA enables multi-scale evaluation, supports policy harmonization across products, and provides a backbone for layering domain-specific taxonomies. This work contributes a systematic vocabulary for describing AI use, advancing both scholarly understanding and practical design of safer, more responsive, and more accountable conversational systems.


MuST2-Learn: Multi-view Spatial-Temporal-Type Learning for Heterogeneous Municipal Service Time Estimation

Asif, Nadia, Hong, Zhiqing, Ren, Shaogang, Zhang, Xiaonan, Shang, Xiaojun, Yuan, Yukun

arXiv.org Artificial Intelligence

Non-emergency municipal services such as city 311 systems have been widely implemented across cities in Canada and the United States to enhance residents' quality of life. These systems enable residents to report issues, e.g., noise complaints, missed garbage collection, and potholes, via phone calls, mobile applications, or webpages. However, residents are often given limited information about when their service requests will be addressed, which can reduce transparency, lower resident satisfaction, and increase the number of follow-up inquiries. Predicting the service time for municipal service requests is challenging due to several complex factors: dynamic spatial-temporal correlations, underlying interactions among heterogeneous service request types, and high variation in service duration even within the same request category. In this work, we propose MuST2-Learn: a Multi-view Spatial-Temporal-Type Learning framework designed to address the aforementioned challenges by jointly modeling spatial, temporal, and service type dimensions. In detail, it incorporates an inter-type encoder to capture relationships among heterogeneous service request types and an intra-type variation encoder to model service time variation within homogeneous types. In addition, a spatiotemporal encoder is integrated to capture spatial and temporal correlations in each request type. The proposed framework is evaluated with extensive experiments using two real-world datasets. The results show that MuST2-Learn reduces mean absolute error by at least 32.5%, outperforming state-of-the-art methods.
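The multi-view structure described above can be sketched in miniature: one encoder per view, with the fused features feeding a service-time estimate. The encoders below are invented toy functions (the real framework uses learned neural encoders), included only to make the view decomposition concrete:

```python
# Toy sketch of the multi-view idea: separate encoders for the
# spatial-temporal, inter-type, and intra-type views, fused into a
# service-time estimate. All features and weights here are illustrative.

def spatiotemporal_encoder(hour: int, zone: int) -> list[float]:
    return [hour / 24.0, zone / 10.0]        # stand-in spatial-temporal features

def inter_type_encoder(request_type: str) -> list[float]:
    related_load = {"pothole": 0.8, "noise": 0.3, "garbage": 0.5}
    return [related_load.get(request_type, 0.5)]  # cross-type interaction proxy

def intra_type_encoder(past_durations: list[float]) -> list[float]:
    mean = sum(past_durations) / len(past_durations)
    var = sum((d - mean) ** 2 for d in past_durations) / len(past_durations)
    return [mean, var]                        # within-type variation proxy

def estimate_service_time(hour, zone, request_type, past_durations) -> float:
    features = (spatiotemporal_encoder(hour, zone)
                + inter_type_encoder(request_type)
                + intra_type_encoder(past_durations))
    weights = [4.0, 2.0, 10.0, 1.0, 0.1]      # a fixed stand-in for a trained head
    return sum(w * f for w, f in zip(weights, features))
```

Keeping the views in separate encoders is what lets each one specialize: the intra-type encoder, for instance, can capture duration variance that a single pooled representation would average away.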


Learning Virtual Machine Scheduling in Cloud Computing through Language Agents

Wu, JieHao, Wang, Ziwei, Sheng, Junjie, Li, Wenhao, Wang, Xiangfeng, Luo, Jun

arXiv.org Artificial Intelligence

In cloud services, virtual machine (VM) scheduling is a typical Online Dynamic Multidimensional Bin Packing (ODMBP) problem, characterized by large-scale complexity and fluctuating demands. Traditional optimization methods struggle to adapt to real-time changes, domain-expert-designed heuristic approaches suffer from rigid strategies, and existing learning-based methods often lack generalizability and interpretability. To address these limitations, this paper proposes a hierarchical language agent framework named MiCo, which provides a large language model (LLM)-driven heuristic design paradigm for solving ODMBP. Specifically, ODMBP is formulated as a Semi-Markov Decision Process with Options (SMDP-Option), enabling dynamic scheduling through a two-stage architecture, i.e., Option Miner and Option Composer. Option Miner utilizes LLMs to discover diverse and useful non-context-aware strategies by interacting with constructed environments. Option Composer employs LLMs to discover a composing strategy that integrates the non-context-aware strategies with the contextual ones. Extensive experiments on real-world enterprise datasets demonstrate that MiCo achieves a 96.9% competitive ratio in large-scale scenarios involving more than 10,000 virtual machines. It maintains high performance even under nonstationary request flows and diverse configurations, thus validating its effectiveness in complex and large-scale cloud environments.
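The miner/composer split can be illustrated with classic bin-packing heuristics standing in for the mined options. The two placement rules and the switching condition below are invented examples, not the LLM-discovered strategies from the paper:

```python
# Illustrative sketch of the two-stage idea: simple non-context-aware
# placement heuristics ("options") plus a composer that picks one based on
# cluster context. Heuristics and the switching rule are invented stand-ins.

def first_fit(vm: float, hosts: list[float]) -> int:
    """Index of the first host with enough free capacity, or -1."""
    for i, free in enumerate(hosts):
        if free >= vm:
            return i
    return -1

def best_fit(vm: float, hosts: list[float]) -> int:
    """Index of the feasible host with the least leftover space, or -1."""
    feasible = [(free - vm, i) for i, free in enumerate(hosts) if free >= vm]
    return min(feasible)[1] if feasible else -1

def compose(vm: float, hosts: list[float]) -> int:
    """Context-aware composer: prefer best-fit when the cluster is tight."""
    utilization = 1 - sum(hosts) / len(hosts)  # host capacity normalized to 1.0
    option = best_fit if utilization > 0.5 else first_fit
    return option(vm, hosts)
```

In the paper's framework, both the option pool and the composing rule are proposed and refined by LLMs interacting with a scheduling environment, rather than hand-written as above.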


Causal AI-based Root Cause Identification: Research to Practice at Scale

Jha, Saurabh, Rahane, Ameet, Shwartz, Laura, Palaci-Olgun, Marc, Bagehorn, Frank, Rios, Jesus, Stingaciu, Dan, Kattinakere, Ragu, Banerjee, Debasish

arXiv.org Artificial Intelligence

Modern applications are increasingly built as vast, intricate, distributed systems. These systems comprise various software modules, often developed by different teams using different programming languages and deployed across hundreds to thousands of machines, sometimes spanning multiple data centers. Given their scale and complexity, these applications are often designed to tolerate failures and performance issues through inbuilt failure recovery techniques (e.g., hardware or software redundancy) or external methods (e.g., health-check-based restarts). Despite every effort, computer systems experience frequent failures: performance degradations and violations of reliability and Key Performance Indicators (KPIs) are inevitable. These failures, depending on their nature, can lead to catastrophic incidents impacting critical systems and customers. Swift and accurate root cause identification is thus essential to avert significant incidents impacting both service quality and end users. In this complex landscape, observability platforms that provide deep insights into system behavior and help identify performance bottlenecks are not just helpful -- they are essential for maintaining reliability, ensuring optimal performance, and quickly resolving issues in production. The ability to reason about these systems in real time is critical to ensuring the scalability and stability of modern services. To aid in these investigations, observability platforms that collect various telemetry data to inform about application behavior and its underlying infrastructure are becoming popular.


Better Think with Tables: Leveraging Tables to Enhance Large Language Model Comprehension

Oh, Jio, Heo, Geon, Oh, Seungjun, Wang, Jindong, Xie, Xing, Whang, Steven Euijong

arXiv.org Artificial Intelligence

Despite the recent advancement of Large Language Models (LLMs), they struggle with complex queries that often involve multiple conditions, which are common in real-world scenarios. We propose Thinking with Tables, a technique that helps LLMs leverage tables for intermediate thinking, aligning with human cognitive behavior. By introducing a pre-instruction that triggers an LLM to organize information in tables, our approach achieves a 40.29% average relative performance increase and higher robustness, and shows generalizability to different requests, conditions, or scenarios. We additionally show the influence of data structuredness on the model by comparing results from four distinct structuring levels that we introduce.
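The technique amounts to prepending a fixed instruction before the user's query. A minimal sketch, with illustrative wording rather than the paper's exact prompt, might be:

```python
# Minimal sketch of a "Thinking with Tables"-style pre-instruction: ask the
# model to tabulate the query's conditions before answering. The prompt
# wording here is illustrative, not the paper's exact text.

PRE_INSTRUCTION = (
    "Before answering, organize the relevant entities and conditions "
    "from the question into a markdown table, then reason over the table."
)

def with_tables(query: str) -> str:
    return f"{PRE_INSTRUCTION}\n\nQuestion: {query}"

prompt = with_tables(
    "Which laptops under $1000 have at least 16 GB RAM and weigh under 1.5 kg?"
)
```

Because the intervention is purely a prompt prefix, it composes with any underlying model or serving stack without retraining.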


DynamoLLM: Designing LLM Inference Clusters for Performance and Energy Efficiency

Stojkovic, Jovan, Zhang, Chaojie, Goiri, Íñigo, Torrellas, Josep, Choukse, Esha

arXiv.org Artificial Intelligence

The rapid evolution and widespread adoption of generative large language models (LLMs) have made them a pivotal workload in various applications. Today, LLM inference clusters receive a large number of queries with strict Service Level Objectives (SLOs). To achieve the desired performance, these models execute on power-hungry GPUs, causing the inference clusters to consume large amounts of energy and, consequently, produce excessive carbon emissions. Fortunately, we find that there is a great opportunity to exploit the heterogeneity in inference compute properties and fluctuations in inference workloads to significantly improve energy efficiency. However, such a diverse and dynamic environment creates a large search space where different system configurations (e.g., number of instances, model parallelism, and GPU frequency) translate into different energy-performance trade-offs. To address these challenges, we propose DynamoLLM, the first energy-management framework for LLM inference environments. DynamoLLM automatically and dynamically reconfigures the inference cluster to optimize for energy and cost of LLM serving under the service's performance SLOs. We show that at a service level, DynamoLLM conserves 53% energy and 38% operational carbon emissions, and reduces the cost to the customer by 61%, while meeting the latency SLOs.
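The core selection step can be sketched as a constrained search: among candidate configurations (instances, parallelism, GPU frequency), pick the lowest-energy one whose predicted latency still meets the SLO. The candidate table and its numbers below are made up for illustration; the real system profiles such trade-offs online:

```python
# Sketch of SLO-constrained configuration selection. Candidate configs and
# their latency/energy numbers are invented; a real system profiles these.

# (instances, tensor_parallelism, gpu_freq_ghz) -> (latency_ms, energy_kj)
CANDIDATES = {
    (4, 1, 1.9): (220.0, 5.0),
    (4, 2, 1.4): (180.0, 4.2),
    (8, 2, 1.1): (150.0, 6.0),
    (8, 4, 1.9): (90.0, 9.5),
}

def pick_config(slo_ms: float) -> tuple:
    """Lowest-energy config meeting the SLO; fastest config if none does."""
    feasible = {c: v for c, v in CANDIDATES.items() if v[0] <= slo_ms}
    if not feasible:
        return min(CANDIDATES, key=lambda c: CANDIDATES[c][0])
    return min(feasible, key=lambda c: feasible[c][1])
```

Note how a looser SLO lets the selector drop to a smaller, lower-frequency configuration, which is exactly where the energy savings come from.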


ChatGPT Doesn't Trust Chargers Fans: Guardrail Sensitivity in Context

Li, Victoria R., Chen, Yida, Saphra, Naomi

arXiv.org Artificial Intelligence

While the biases of language models in production are extensively documented, the biases of their guardrails have been neglected. This paper studies how contextual information about the user influences the likelihood of an LLM to refuse to execute a request. By generating user biographies that offer ideological and demographic information, we find a number of biases in guardrail sensitivity on GPT-3.5. Younger, female, and Asian-American personas are more likely to trigger a refusal guardrail when requesting censored or illegal information. Guardrails are also sycophantic, refusing to comply with requests for a political position the user is likely to disagree with. We find that certain identity groups and seemingly innocuous information, e.g., sports fandom, can elicit changes in guardrail sensitivity similar to direct statements of political ideology. For each demographic category and even for American football team fandom, we find that ChatGPT appears to infer a likely political ideology and modify guardrail behavior accordingly.
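The experimental design above reduces to pairing persona biographies with a fixed request and comparing refusal rates across personas. A minimal sketch, with a stubbed model in place of real API calls and simplistic keyword-based refusal detection, might look like this:

```python
# Sketch of a guardrail-sensitivity probe: prepend each persona biography to
# a fixed request and compare refusal rates. `toy_model` is a stub for real
# API calls; keyword-based refusal detection is a simplification.

REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry")

def is_refusal(response: str) -> bool:
    return response.lower().startswith(REFUSAL_MARKERS)

def refusal_rate(ask_model, biographies: list[str], request: str) -> float:
    refusals = sum(
        is_refusal(ask_model(f"{bio}\n\n{request}")) for bio in biographies
    )
    return refusals / len(biographies)

# Stub model that (purely for illustration) refuses one persona more often.
def toy_model(prompt: str) -> str:
    return "I cannot help with that." if "Chargers fan" in prompt else "Sure: ..."
```

Holding the request constant while varying only the biography isolates the guardrail's sensitivity to user context from its sensitivity to the request itself.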